Homework 7 - More on joins

Author

Christopher Torres

Published

Invalid Date

Last week, we worked with setting up a database. For this assignment, we are going to take that one step further and add in a few more tables to give us more practice working with join statements inside of SQL!

GITHUB URL: https://github.com/cmsc-vcu/cmsc408-sp2025-hw7-serrotrehpotsirhc.git

from helpers import create_database_engine, run_sql_and_return_df, run_sql_and_return_html, create_db_wrapper, execute_ddl_from_file

# Load these variables from .env file.
config_map = {
  'user': "CMSC408_HW6_USER",
  'password': "CMSC408_HW6_PASSWORD",
  'host': "CMSC408_HW6_HOST",
  'database': "CMSC408_HW6_DB_NAME"
}

cnx,config = create_db_wrapper( config_map )

Overview and description

The database will represent a company-wide resume, capturing the collective skills and certifications of all the people in your company. The database will capture skills, employees and their roles within the company. The database will help our employer find employees with specific skills, list the skills of any employee and help find gaps in the skill sets of employees in specific roles.

Crows-foot diagram

This is the Crows Foot diagram for this database. People have skills. People can have many skills, and skills can each have many people. Both People and Skills are each One to Many toward each other, making both Many to Many.

erDiagram
    people ||--o{ peopleskills : has
    skills ||--o{ peopleskills : includes

    people {
        int id PK
        string first_name
        string last_name
        string email
        string linkedin_url
        string headshot_url
        string discord_handle
        string brief_bio
        date date_joined
    }

    skills {
        int id PK
        string name
        string description
        string tag
        string url
        string extra
    }

    peopleskills {
        int id PK
        int skills_id FK
        int people_id FK
        datetime date_acquired
    }

Loading the database

ddl_file_name = "./my-ddl.sql"
messages,errors = execute_ddl_from_file( ddl_file_name, cnx)

if errors:
    for error in errors:
        print(f"{error}<br/>")
else:
    print(f"No errors detected while loading: {ddl_file_name}")

No errors detected while loading: ./my-ddl.sql

Examples of data in the database

The following sections provide an overview of the schema including table names, and number of rows and columns in each table.

For the people, skills, and roles tables, a description of each table is presented along with it’s contents.

Tables and metrics in the database

table_name row_count column_count
0 countries 25 3
1 departments 27 4
2 emp_details_view 106 16
3 employees 107 11
4 job_grades 6 3
5 job_history 10 5
6 jobs 19 4
7 locations 23 6
8 people 10 9
9 peopleroles 15 4
10 peopleskills 26 4
11 regions 4 2
12 roles 6 3
13 skills 8 6

People table

The people table contains a SQL Select call to print the entire people table, using the *.

Below is a list of data in the people table.

run_sql_and_return_html( cnx,f"""
select * from people;
""" )
id first_name last_name email linkedin_url headshot_url discord_handle brief_bio date_joined
1 Avery Person 1 avery1@example.com https://linkedin.com/in/avery1 https://example.com/headshots/avery.jpg @avery01 Fitness enthusiast and tech blogger. 2025-01-01
2 Jordan Person 2 jordan2@example.com https://linkedin.com/in/jordan2 https://example.com/headshots/jordan.jpg @jordan02 Mechanical engineer who loves extreme sports. 2025-01-01
3 Taylor Person 3 taylor3@example.com https://linkedin.com/in/taylor3 https://example.com/headshots/taylor.jpg @taylor03 Writer and hobbyist swimmer. 2025-01-01
4 Morgan Person 4 morgan4@example.com https://linkedin.com/in/morgan4 https://example.com/headshots/morgan.jpg @morgan04 Artist focused on sustainable designs. 2025-01-01
5 Riley Person 5 riley5@example.com https://linkedin.com/in/riley5 https://example.com/headshots/riley.jpg @riley05 Racing enthusiast and cliff diver. 2025-01-01
6 Skylar Person 6 skylar6@example.com https://linkedin.com/in/skylar6 https://example.com/headshots/skylar.jpg @skylar06 Tech consultant with a passion for flying. 2025-01-01
7 Casey Person 7 casey7@example.com https://linkedin.com/in/casey7 https://example.com/headshots/casey.jpg @casey07 Athlete and mural enthusiast. 2025-01-01
8 Quinn Person 8 quinn8@example.com https://linkedin.com/in/quinn8 https://example.com/headshots/quinn.jpg @quinn08 Entrepreneur and adventure lover. 2025-01-01
9 Reese Person 9 reese9@example.com https://linkedin.com/in/reese9 https://example.com/headshots/reese.jpg @reese09 Freelancer and swimmer. 2025-01-01
10 Parker Person 10 parker10@example.com https://linkedin.com/in/parker10 https://example.com/headshots/parker.jpg @parker10 Strength coach and motivational speaker. 2025-01-01
Total Rows: 10, Total Columns: 9

Skills table

The skills table contains a SQL Select call to print the entire skills table, using the *.

Below is a list of data in the skills table.

run_sql_and_return_html( cnx,f"""
select * from skills;
""" )
id name description tag url time_commitment
1 Rope Jumping Jumping Rope Skill 1 www.google.com NULL
2 Sky Diving Jumping Rope Skill 2 www.google.com NULL
3 Race Car Driving Jumping Rope Skill 3 www.google.com NULL
4 Heavy Lifting Jumping Rope Skill 4 www.google.com NULL
5 Swimmer Jumping Rope Skill 5 www.google.com NULL
6 Cliff Jumping Jumping Rope Skill 6 www.google.com NULL
7 Mural Artist Jumping Rope Skill 7 www.google.com NULL
8 Author Jumping Rope Skill 8 www.google.com NULL
Total Rows: 8, Total Columns: 6

Roles table

The roles table contains the roles of each employee, which are defined in our MY-DL.SQl File. To call all of these roles for this python snippet, we wil use a select call with an * to print ALL the roles inside the table!

Below is a list of data in the roles table.

run_sql_and_return_html( cnx,f"""
select * from roles;
""" )
id name sort_priority
1 Designer 10
2 Developer 20
3 Recruit 30
4 Team Lead 40
5 Boss 50
6 Mentor 60
Total Rows: 6, Total Columns: 3

Sample queries

List skill names of Person 5

run_sql_and_return_html( cnx,f"""
SELECT s.name
FROM skills s
JOIN peopleskills ps ON s.id = ps.skills_id
WHERE ps.people_id = 5;
""" )
name
Race Car Driving
Cliff Jumping
Total Rows: 2, Total Columns: 1

List people with Skill 2

run_sql_and_return_html( cnx,f"""
SELECT p.first_name, p.last_name
FROM people p
JOIN peopleskills ps ON p.id = ps.people_id
WHERE ps.skills_id = 2;
""" )
first_name last_name
Skylar Person 6
Reese Person 9
Total Rows: 2, Total Columns: 2

List people with a DEVELOPER role

run_sql_and_return_html( cnx,f"""
SELECT CONCAT(p.first_name, ' ', p.last_name) AS person_name,
       r.name AS role_name
FROM peopleroles pr
JOIN roles r ON pr.role_id = r.id
JOIN people p ON pr.people_id = p.id
WHERE r.name = 'Developer';
""" )
person_name role_name
Avery Person 1 Developer
Taylor Person 3 Developer
Skylar Person 6 Developer
Reese Person 9 Developer
Parker Person 10 Developer
Total Rows: 5, Total Columns: 2

List names and email addresses of people without skills

run_sql_and_return_html( cnx,f"""
SELECT p.first_name, p.last_name, p.email
FROM people p
LEFT JOIN peopleskills ps ON p.id = ps.people_id
WHERE ps.skills_id IS NULL;
""" )
first_name last_name email
Morgan Person 4 morgan4@example.com
Total Rows: 1, Total Columns: 3

List names and tags of unused skills

run_sql_and_return_html( cnx,f"""
SELECT s.name, s.tag
FROM skills s
LEFT JOIN peopleskills ps ON s.id = ps.skills_id
WHERE ps.people_id IS NULL;
""" )
name tag
Mural Artist Skill 7
Author Skill 8
Total Rows: 2, Total Columns: 2

List people names and skill names with the BOSS role

run_sql_and_return_html( cnx,f"""
SELECT CONCAT(p.first_name, ' ', p.last_name) AS person_name,
       s.name AS skill_name,
       r.name AS role_name
FROM peopleroles pr
JOIN roles r ON pr.role_id = r.id
JOIN people p ON pr.people_id = p.id
JOIN peopleskills ps ON ps.people_id = p.id
JOIN skills s ON s.id = ps.skills_id
WHERE r.name = 'Boss';
""" )
person_name skill_name role_name
Jordan Person 2 Race Car Driving Boss
Jordan Person 2 Heavy Lifting Boss
Jordan Person 2 Swimmer Boss
Total Rows: 3, Total Columns: 3

List ids and names of unused roles

run_sql_and_return_html( cnx,f"""
SELECT r.id, r.name
FROM roles r
LEFT JOIN peopleroles pr ON r.id = pr.role_id
WHERE pr.people_id IS NULL;
""" )
id name
no records returned
Total Rows: 1, Total Columns: 2

List people and the number of skills they have acquired.

(Goal: Practice JOIN, GROUP BY, and COUNT())

run_sql_and_return_html( cnx,f"""
SELECT
  p.id AS person_id,
  CONCAT(p.first_name, ' ', p.last_name) AS person_name,
  COUNT(ps.skills_id) AS skill_count
FROM
  people p
  LEFT JOIN peopleskills ps ON p.id = ps.people_id
GROUP BY
  p.id, p.first_name, p.last_name
ORDER BY
  skill_count DESC, person_name;
""" )
person_id person_name skill_count
8 Quinn Person 8 4
1 Avery Person 1 3
7 Casey Person 7 3
2 Jordan Person 2 3
10 Parker Person 10 3
9 Reese Person 9 3
6 Skylar Person 6 3
5 Riley Person 5 2
3 Taylor Person 3 2
4 Morgan Person 4 0
Total Rows: 10, Total Columns: 3

List each skill and the number of people who have acquired it.

(Goal: Practice JOIN, GROUP BY, COUNT())

run_sql_and_return_html( cnx,f"""
SELECT
  skills.tag,
  COUNT(peopleskills.skills_id)
FROM
  skills
    left join peopleskills
      on (skills.id=peopleskills.skills_id)
GROUP BY
  skills.tag
ORDER BY
  COUNT(peopleskills.skills_id) DESC
""" )
tag COUNT(peopleskills.skills_id)
Skill 3 6
Skill 5 6
Skill 6 5
Skill 1 4
Skill 4 3
Skill 2 2
Skill 7 0
Skill 8 0
Total Rows: 8, Total Columns: 2

List all people who have acquired more than 2 skills

(Goal: Join through multiple tables and use HAVING)

run_sql_and_return_html( cnx,f"""
SELECT
  p.id AS person_id,
  CONCAT(p.first_name, ' ', p.last_name) AS person_name,
  COUNT(ps.skills_id) AS skill_count
FROM
  people p
  JOIN peopleskills ps ON p.id = ps.people_id
GROUP BY
  p.id, p.first_name, p.last_name
HAVING
  COUNT(ps.skills_id) > 2
ORDER BY
  skill_count DESC, person_name;
""" )
person_id person_name skill_count
8 Quinn Person 8 4
1 Avery Person 1 3
7 Casey Person 7 3
2 Jordan Person 2 3
10 Parker Person 10 3
9 Reese Person 9 3
6 Skylar Person 6 3
Total Rows: 7, Total Columns: 3

Reflection

What did you like most about this project?
This project retained the architecture of Homework 6, so it was easy to get started, and I was less worried about trying to get everything working this time around. I wish more assignments were like this, instead of every assignment being daunting just to get up and running. By eliminating this step, I was actually able to focus on the work and apply concepts from class!
What was most challenging about this project?
Very little difficulty here. Logically working through how to define and populate the roles table, but that was straightforward from Homework 6. The python code in Report.QMD was difficult at first. It was more conceptualizing what tables I needed to join rather than simply implementing it!
What would you do different next time?
Not much! I love watching the lectures! I would probably run through a few more problem sets of this code. There should be a website graded for completion which generates SQL commands, where you can go through a bunch of implementation problems without worrying about your entire program not running or whatnot! This would isolate the skills of SQL without causing the stress of worrying that a mistake in code will be the reason the entire program doesn’t work! Over time, you would get better at SQL and be confident enough to work through implementation issues!

README

Below is the README from my project.

CMSC408 - Fall 2024 - Homework 7

Hello!

Last week, in Homework 6, I set up connection to the remote HR database, and started to work with tables defined and popoulated in the MY-DDL.SQL file from Homework 6! We used Select statements last week, to pull the tables we wanted to from the database using SQL commands! This week, we are going to repeat the basic information of Homework 6, while implementing addition SQL problem sets using JOIN commands to incorporate the newly formed Roles and Peopleroles tables inside this newly updated MY-DDL.SQL file in Homework 7!

We are going to be writing the SQL commands to manipulate the HR database through python code inside of:

Report.QMD

We are going to create and populate the Roles and Peopleroles tables using DDL inside of:

MY-DDL.SQL

Final Notes

Each of these files are inside the reports folder. Our credentials are still stored in the .env file. We need SQL, Quarto, Poetry, and Pandas installed, with the SQLTools extension to work through SQL commands. We will use Quarto to render the Report.QMD file into an HTML deliverable!

Thank you!